Appl. No. 10,692,222 

Amdt. dated October 21, 2008 

Reply to Office action of September 26, 2008 

Amendments to the Specification: 

Please replace paragraph page 9 line 21 to page 10 line 1 with the 
following amended paragraph: 

[Non-patent document 1] 

WTP (WebSphere Transcoding Publisher, 
http://www-6.ibm.com/jp/software/network/transcoding/) 

Please replace paragraph page 10 lines 2-7 with the following amended 
paragraph: 

[Non-patent document 2] 

CHIP[I] Ito "Construction method of distributed applications by integration 
of GUI parts and WEB services," Japan Society for Software Science and 
Technology WISS 2001 Proceedings 
(http://ca.meme.hokudai.ac.jp/people/itok/CHIP/indexJ.html) 

Please replace paragraph page 10 lines 8-10 with the following 
amended paragraph: 

[Non-patent document 3] 

IBM mySite Outliner 
(http://www-6.ibm.com/jp/pc/clubibm/msol/index.shtml) 

Please replace paragraph page 10 lines 11,12 with the following 
amended paragraph: 

[Non-patent document 4] 

DiffWeb (http://www.diffweb.com/) 

Please replace paragraph page 10 lines 13 and 14 with the following 
amended paragraph: 

[Non-patent document 5] 

HTML Diff (http://www-db.stanford.edu/c3/c3.html) 

Please replace paragraph page 10 lines 15 and 16 with the following 
amended paragraph: 

[Non-patent document 6] 

MindIt (http://mindit.netmind.com/mindit.shtml) 

Please replace paragraph page 25, lines 14-20 with the following 
amended paragraph: 

For example, for a Web content serving as the target content whose URL is 
http://www.asahi.com/0606/news/national06015.html, the following URLs are 
listed as adjacent structured/hierarchical contents: 
http://www.asahi.com/0606/news/national06012.html 
http://www.asahi.com/0606/news/national06013.html 
http://www.asahi.com/0606/news/national06014.html 

Please replace paragraph page 26 lines 5-18 with the following amended 
paragraph: 

For example, for a Web content serving as the target content whose URL is 
http://www.cnn.com/2000/US/06/05/sea.based.defense/index.html, the following 
URLs are listed as adjacent structured/hierarchical contents: 
http://www.cnn.com/2000/US/06/05/dday.remembrance/index.html 
http://www.cnn.com/2000/US/06/05/helicopter.escape.03/index. 
http://www.cnn.com/2000/US/06/05/curbing.terrorism.02/index. 



Please replace paragraph page 38 line 9 to page 40 line 4 with the 
following amended paragraph: 

"http://www.asahi.com/business/update/1019/002.html" 
In order to generate an appropriate matching pattern even in the case as 
described above, the present invention introduces a concept titled "adjacent 
structured/hierarchical contents to a target content." The adjacent 
structured/hierarchical contents are structured/hierarchical contents, which have 
URLs analogous to that of the target content and are made to belong to the 
same group as that of the target content in the case of a matching 
determination by means of the matching pattern. The analogous range of the 
URLs is varied depending on the extent to which the author determines that 
differences of the structured/hierarchical contents are allowable and the 
different contents belong to the same group. The URLs include directories 
(portions partitioned by forward slashes in the example of the business article in 
Asahi Shimbun) in the respective hierarchies. When the URLs of the contents to 
be determined whether or not they are the adjacent structured/hierarchical 
contents are collated with the URL of the target content, if directories up to a 
predetermined number (one or more) of hierarchies from the uppermost 
hierarchy are identical and only directories in hierarchies lower than the 
hierarchies where the directories are identical differ, the content portions 
subjected to the determination may be determined as adjacent content portions. 
Specific examples of the adjacent content portions are listed as follows. In the 
following cases, the structured/hierarchical contents subjected to the 
determination are determined as the adjacent structured/hierarchical contents. 
(a) Only a portion recognized as a date in the URL differs from that of the 
target content. In the above-described example of the business article of Asahi 
Shimbun, the relevant portion is "1019." (b) Only a portion used as numbering in 
the URL differs from that of the target content. In the above-described example 
of the business article of Asahi Shimbun, the relevant portion is "002.html." 
(c) Only the above-described (a) and (b) differ from those of the target content. 
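
The directory-based determination described above can be illustrated with a minimal sketch. It is not the claimed implementation: the function names, the `shared_levels` parameter, and the requirement that both URLs have the same number of hierarchies are assumptions made only for illustration, and the candidate URLs are hypothetical variations of the Asahi Shimbun example.

```python
from urllib.parse import urlparse

# Target URL taken from the business article example in the text.
TARGET = "http://www.asahi.com/business/update/1019/002.html"


def path_segments(url):
    """Return the directory hierarchy of a URL (portions partitioned by slashes)."""
    return [seg for seg in urlparse(url).path.split("/") if seg]


def is_adjacent(target_url, candidate_url, shared_levels=1):
    """Hypothetical adjacency check.

    True when the host and the top `shared_levels` directories are identical
    and only the lower hierarchies differ. Requiring equal depth is an
    assumption of this sketch, not something stated in the specification.
    """
    if urlparse(target_url).netloc != urlparse(candidate_url).netloc:
        return False
    target = path_segments(target_url)
    candidate = path_segments(candidate_url)
    if len(target) != len(candidate) or len(target) <= shared_levels:
        return False
    if target[:shared_levels] != candidate[:shared_levels]:
        return False
    # At least one lower hierarchy (e.g. the date or the numbering) must differ.
    return target[shared_levels:] != candidate[shared_levels:]


# Case (a): only the date portion differs.
date_only = is_adjacent(TARGET, "http://www.asahi.com/business/update/1020/002.html", 2)
# Case (b): only the numbering portion differs.
number_only = is_adjacent(TARGET, "http://www.asahi.com/business/update/1019/003.html", 2)
# A different upper directory is not treated as adjacent.
other_section = is_adjacent(TARGET, "http://www.asahi.com/international/update/1005/010.html", 2)
```

A larger `shared_levels` value narrows the analogous range, which corresponds to the author allowing fewer differences between contents of the same group.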

Please replace paragraph page 77 lines 14 to 20 with the following 
amended paragraph: 

The case where the past page is not present occurs not only when the 
caching of the past pages is not performed but also when the URLs are 
generated every day. For example, in the case where a date is utilized as a part 
of a URL as in a URL of a newspaper article, it is obvious that no past pages 
can be present 
(http://www.asahi.com/international/update/1005/010.html). 

Please delete paragraph page 78 lines 5-7 beginning with "Example:". 

Please delete paragraph page 78 lines 8-9 beginning with "Adjacent URL:". 

Please replace the paragraph page 96 line 11 to page 99 line 2 with the 
following paragraph: 

The following sample is premised on the implementation as 
described above. Note that the following is a description 
example according to the RELAX NG format. 
<?xml version="1.0" ?> 
<grammar xmlns="http://relaxng.org/ns/structure/0.9"> 
<start> 
  <element name="table"> 
    <attribute name="width"> 
      <value>168</value> 
    </attribute> 
    <zeroOrMore> 
      <choice> 
        <ref name="freeAttributesTABLE"/> 
      </choice> 
    </zeroOrMore> 
    <element name="tbody"> 
      <zeroOrMore> 
        <choice> 
          <ref name="freeAttributesTBODY"/> 
        </choice> 
      </zeroOrMore> 
      <oneOrMore> 
        <element name="tr"> 
          <attribute name="bgcolor"> 
            <value>ffffff</value> 
          </attribute> 
          <zeroOrMore> 
            <choice> 
              <ref name="freeAttributeTR"/> 
            </choice> 
          </zeroOrMore> 
          <element name="td"> 
            <zeroOrMore> 
              <choice> 
                <ref name="freeAttributesTD"/> 
              </choice> 
            </zeroOrMore> 
            <element name="small"> 
              <zeroOrMore> 
                <choice> 
                  <ref name="freeAttributesSMALL"/> 
                </choice> 
              </zeroOrMore> 
              <element name="a"> 
                <zeroOrMore> 
                  <choice> 
                    <ref name="freeAttributesA"/> 
                  </choice> 
                </zeroOrMore> 
                <text/> 
              </element> 
            </element> 
          </element> 
        </element> 
      </oneOrMore> 
    </element> 
  </element> 
</start> 

Please replace the paragraph at page 102 line 8 to page 103 line 20 with the 
following amended paragraph: 

<?xml version="1.0" encoding="utf-8" ?> 
<rdf:RDF 
  xmlns="http://purl.org/rss/1.0/" 
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  xml:lang="ja"> 
  <channel rdf:about="http://news.lycos.co.jp/topics/rss.rdf"> 
    <title>News LYCOS Saishin Topics</title> 
    <link>http://news.lycos.co.jp/topics</link> 
    <items> 
      <rdf:Seq> 
        <rdf:li rdf:resource="http://news.lycos.co.jp/topics" /> 
      </rdf:Seq> 
    </items> 
  </channel> 
  <item rdf:about="http://news.lycos.co.jp/topics/society/maff.html"> 
    <title>isahayawan kinpaku-no-naka koji saikai</title> 
    <link>http://news.lycos.co.jp/topics/society/maff.html</link> 
  </item> 
  <item rdf:about="http://news.lycos.co.jp/topics/world/operation.html"> 
    <title>arukaida sento-in amerika-ni-tokoh</title> 
    <link>http://news.lycos.co.jp/topics/world/operation.html</link> 
  </item> 
  <item rdf:about="http://news.lycos.co.jp/topics/computer/ms.html"> 
    <title>maikurosofuto kadenbunya shinshutsu</title> 
    <link>http://news.lycos.co.jp/topics/computer/ms.html</link> 
  </item> 

Please replace the paragraph page 110 line 6 to page 111 line 10 with the 
following paragraph: 

[Other Example 4: Application to Information Aggregator] 
Partial cutting out of Web pages and integration of information 
are widely performed in portal construction systems such 
as the IBM Portal Server and in information 
extraction/submission systems such as the IBM mySiteOutliner. 
The present invention is applicable to these systems. For 
example, in the IBM mySiteOutliner, an XPath such as the one 
below is held in a definition file in order to extract a 
headline link list from the Web page. 
<ClippingDefinition> 
  <id>2</id> 
  <links> 
    <link title="Club IBM Top Page">http://www.ibm.com/jp/pc/clubibm/index.html</link> 
  </links> 
  <urldata> 
    <url source="Club IBM">http://www.ibm.com/jp/pc/clubibm/index.html</url> 
    <xpathlists> 
      <xpath name="body text"> 
/html[1]/body[1]/table[2]/tbody[1]/tr[1]/td[2]/table[2]/tbody[1]/tr[5]/td[2]/table[1]/tbody[1]/tr[1]/td[1]/table[2]/tbody[1]/tr[2]/td[1] 
      </xpath> 
    </xpathlists> 
  </urldata> 
</ClippingDefinition> 
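
How a positional XPath held in such a definition file might be applied to a page can be sketched as follows. This is an illustration only and not the mySiteOutliner implementation: the shortened definition, the sample page, and the `clip` helper are all hypothetical, and the page is assumed to be well-formed XML so that the Python standard library can parse it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened definition file modeled on the example in the text.
DEFINITION = """
<ClippingDefinition>
  <urldata>
    <xpathlists>
      <xpath name="body text">/html[1]/body[1]/table[1]/tbody[1]/tr[2]/td[1]</xpath>
    </xpathlists>
  </urldata>
</ClippingDefinition>
"""

# Hypothetical page; real Web pages would first need conversion to well-formed XML.
PAGE = """
<html><body><table><tbody>
  <tr><td>banner</td></tr>
  <tr><td><a href="/news/1.html">Headline 1</a><a href="/news/2.html">Headline 2</a></td></tr>
</tbody></table></body></html>
"""


def clip(page_root, absolute_xpath):
    """Return the element addressed by an absolute positional XPath, or None.

    ElementTree only evaluates paths relative to an element, so the first
    step is checked against the root tag and the remaining steps are
    evaluated relative to the root.
    """
    steps = absolute_xpath.strip().strip("/").split("/")
    if page_root.tag != steps[0].split("[")[0]:
        return None
    if len(steps) == 1:
        return page_root
    return page_root.find("./" + "/".join(steps[1:]))


definition = ET.fromstring(DEFINITION)
xpath = definition.find("./urldata/xpathlists/xpath").text
cell = clip(ET.fromstring(PAGE), xpath)
# Collect the headline link list from the clipped table cell.
headlines = [(a.text, a.get("href")) for a in cell.iter("a")]
```

Because every step of such an XPath is positional, inserting a single table or row above the clipped cell changes which element is addressed.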